System And Method For Facilitating Use Of Commercial Off-The-Shelf (COTS) Components In Radiation-Tolerant Electronic Systems

ABSTRACT

A method for selecting components in a radiation tolerant electronic system, comprising: determining ionizing radiation responses of COTS devices under various radiation conditions; selecting a subset of the COTS devices whose radiation responses satisfy threshold radiation levels; applying mathematical models of the COTS devices for post-irradiation conditions to determine radiation responses to ionizing radiation; implementing a radiation-tolerant architecture using COTS devices from the selected subset, wherein the implemented circuit may be tested for robustness to ionizing radiation effects without repeated destructive tests of the hardware circuit by using the mathematical models to simulate its response to the ionizing radiation; and implementing a multi-layer shielding to protect the implemented circuit under various radiation conditions.

CROSS REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of U.S. Patent Application No. 62/810,237 filed Feb. 25, 2019, the entirety of which is incorporated herein by reference.

FIELD

The present matter relates to the field of radiation tolerant electronic systems, and more particularly to methods and systems for a radiation tolerant architecture facilitating use of commercial off-the-shelf (COTS) components.

BACKGROUND

In high radiation environments, such as nuclear power plants, aeronautical systems, space systems, and military environments, electronic systems are usually designed with radiation hardened (rad-hardened) components to withstand ambient radiation levels. However, if radiation increases massively over the ambient levels, as for example with the release of radioactive substances in a nuclear power plant accident, it is important for the electronic systems to continue operating. Failure of monitoring instruments in these radiation environments makes it difficult to obtain important information about them. For example, in a nuclear plant disaster such as the Fukushima Daiichi nuclear disaster, one of the biggest challenges for first responders after the accident was to obtain up-to-date status of the radiation sites and key safety related systems, due to a lack of operational monitoring instruments. The important data are indicative of conditions which may include, amongst others, radiation levels, water levels, humidity, gas levels, hydrogen concentrations, and temperatures.

An approach to monitoring, as for example in nuclear power plants (NPP), is to use separate wireless technologies, in a so-termed post-accident monitoring system (PAMS), to relay information about the environmental conditions, such as reactor integrity and the environment in the vicinity of the NPP, as existing communication infrastructure is likely to be damaged. However, even with this approach, in the event of a severe nuclear accident a significant amount of radiation, which may include alpha (α) and beta (β) particles, gamma (γ) rays, x-rays, and neutrons, may be released due to failure of protection layers. Consider the Fukushima accident as an example: in March 2012, the radiation level was estimated to be up to 73 Sv/h (sieverts per hour) inside the containment of the No. 2 reactor, and in February 2017 it was up to 530 Sv/h. Studies have indicated that electronic components made of semiconductor materials may start to degrade when the ambient radiation level becomes higher than 10 Sv. These levels of radiation are high enough to cause severe functional damage to electronic components in the monitoring system if deployed at that site.

Another approach uses rad-hardened electronic components in such systems to increase radiation tolerance. Rad-hardening of a digital integrated circuit is a manufacturing-level approach which consists in using particular process technologies (e.g. Silicon-on-Insulator) and/or circuit design patterns to improve fault tolerance. This approach can be prohibitively expensive due to the specialized semiconductor materials used in chip fabrication, the complexity of the manufacturing and packaging processes, and a market too small to offset the investment in production. Furthermore, rad-hardened components are mostly based on proven, often older, technologies, and seldom match the performance offered by newer components in terms of the processing speed, memory size, and ultra-low power consumption expected of modern monitoring systems. In addition, they are usually designed for a particular application, not for scalability, and are therefore not reusable.

An approach to radiation tolerant architecture uses triple modular redundancy (TMR) to triplicate important circuits and subsystems, and then relies on a majority voting system or additional circuits to detect and correct radiation induced errors. However, the added elements not only increase the overall system complexity, but some of them may not be redundant and are subject to faults common to all the duplicated circuits, termed common mode faults, being fed into the voter, possibly resulting in a faulty output and a potential system failure. Diverse TMR may be employed by designing functionally identical circuits, each in a different domain, to reduce the potential for common faults. However, these architectures may also not be entirely fault tolerant. For example, a fault tolerant platform developed for space applications adopts a redundant architecture, but its inter-module communication and control buses are non-redundant. As such, the entire system could cease operation if a fault occurs in one module on the bus. In other examples, though multiple processing and memory units are used, the control logic unit has no redundancy, and it is also sensitive to radiation effects. These systems thus continue to have potential vulnerability to failure.

SUMMARY

In a general aspect the present matter provides an electronic system, for use in environments with high levels of radiation, wherein the system is configured to have a radiation tolerant architecture providing fault tolerant electronic circuits constructed, at least in part, with commercial off-the-shelf (COTS) components.

In one aspect of the present matter there is provided a method for a radiation tolerant electronic system, tolerant to cumulative and single event radiation effects, the method comprising: selecting a group of electronic components that continue to be operable below a designated cumulative radiation exposure threshold; and configuring a circuit architecture to employ said selected components, wherein said circuit architecture configuration is tolerant to said single event effects of radiation.

In a further aspect there is provided a radiation tolerant electronic system architecture comprising: a plurality of redundancy channels for executing a circuit function, wherein each said channel duplicates the circuit function with a distinct and different diversity of components from a group of electronic components selected based on one or more criteria related to radiation tolerance.

In a further aspect of the architecture, the components are COTS components.

In a further aspect the architecture includes a detecting and diagnosing mechanism configured in each of the plurality of channels, wherein each channel is able to detect abnormal operation in one or more channels and provide reconfiguration information to activate or deactivate channels.

In a still further aspect the channels are arranged to form a triple modular redundancy core of active and corresponding spare channels.

In a further aspect the architecture includes multilayer shielding, each layer comprising different materials determined by the diversity of selected components.

In a still further aspect the architecture includes bus and power configurators for reconfiguring power and bus signals between channels in response to signals from the diagnosing and detecting mechanism.

In a still further aspect, the configurators are implemented with passive COTS components selected from one or more of resistors, capacitors and non-electronic relays.

In accordance with one aspect the present matter provides a method for selecting components in a radiation tolerant electronic system, comprising: determining ionizing radiation responses of COTS devices under various radiation conditions; selecting a subset of the COTS devices whose radiation responses satisfy threshold radiation levels; applying mathematical models of the COTS devices for post-irradiation conditions to determine radiation responses to ionizing radiation; and implementing a hardware circuit using COTS devices from the selected subset, wherein the implemented circuit may be tested for robustness to ionizing radiation effects without repeated destructive tests of the hardware circuit by using the mathematical models for simulating response to the ionizing radiation.

In accordance with a further aspect of the present matter there is provided a radiation-tolerant design method for implementing circuit functions using COTS components. In a further embodiment the present method may exclusively use COTS components. In an aspect the method includes one or more of: understanding vulnerabilities of various electronic components in ionizing radiation environments; and developing a circuit architecture with a redundant channel configuration. This may include one or more of: selecting diversified components with non-electronics-based switches for channel selection; adding on-line, real-time fault-detection and prognostic schemes to switch among different channels to maintain continued operation; and using diversified multi-layer shielding protections to reduce a total radiation level.

In an embodiment there is provided a radiation tolerant device comprising: three independent single-channel wireless devices, using diversified semiconductor components (such as bipolar, CMOS, and hybrid) in respective channels; and shielding protection having the single-channel devices orientated within the shielding to minimize common mode faults.

In a further embodiment the present matter provides a method of operating a radiation tolerant system having redundant circuit channels, comprising: detecting failure in a channel by one or more of the failed channel's diagnosis unit or diagnosis units in channels external to the failed channel; and providing a reconfiguration signal from decision making units in non-failed channels, based on the detection by the diagnosis units, to remove power from the failed channel and to apply power to a spare channel.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments of the present matter will now be described by way of example with reference to the accompanying drawings in which like references denote similar elements, and in which:

FIG. 1 shows a radiation-tolerant circuit architecture according to an embodiment of the present matter;

FIG. 2 shows an abstraction of a hierarchical fault model according to an embodiment of the present matter;

FIG. 3 shows a flowchart of a master selection mechanism according to an embodiment of the present matter;

FIG. 4 shows a block diagram of the decision-making unit according to an embodiment of the present matter;

FIG. 5 shows a flowchart for the decision-making unit according to an embodiment of the present matter;

FIGS. 6a and 6b show a schematic diagram of a power configurator and a bus configurator respectively according to an embodiment of the present matter;

FIG. 7 shows a functional organization and data flow diagram for the fault detection and diagnosis in the decision-making unit according to an embodiment of the present matter;

FIG. 8 shows a block diagram for detection logic allocation according to an embodiment of the present matter;

FIG. 9 shows voltage levels of a circuit block under a fault state according to an embodiment of the present matter;

FIG. 10 shows a flowchart of a fault detection loop in each channel according to an embodiment of the present matter;

FIG. 11 shows a flowchart of fault diagnosis according to an embodiment of the present matter;

FIG. 12a1 shows a top view of a physical circuit board configuration according to an embodiment of the present matter;

FIGS. 12a2, 12b, 12c and 12d show respective layers of a multi-layer radiation shielding according to an embodiment of the present matter;

FIG. 13 shows a graph of typical radiation tolerance for selected components; and

FIG. 14 shows a block diagram of a single channel wireless monitoring system with three diversified channels, according to an exemplary embodiment of the present matter.

DETAILED DESCRIPTION

In accordance with a general embodiment of the present matter there is provided a radiation tolerant architecture to mitigate radiation damage to electronic circuits. The radiation effects may be grouped as cumulative effects and single-event effects (SEEs). Cumulative effects due to total ionizing dose (TID), which refers to the total amount of energy deposited by radiation particles passing through semiconductor materials, may be mitigated with shielding protection, component selection, and diversified hardware. TID is a consideration whenever electronic devices made from such materials are exposed to a strong radiation environment. SEEs may be subdivided into nondestructive effects and destructive effects. Nondestructive effects may be mitigated with redundancy, system reset, and fault detection. Destructive effects may be mitigated with rapid power-off, redundancy, fault detection, and prognostics or prediction of the component lifespan.

According to an embodiment of the present matter there is provided a method for investigating the radiation tolerance of regular COTS components for suitability in a radiation tolerant electronic system. Most COTS-based semiconductor components may experience performance degradation and radiation damage when the total dose is greater than a threshold radiation value, typically 20 krad (Si). A principle of component selection is given to obtain suitable components, and a method is proposed to assess component reliability in radiation environments, which uses radiation degradation factors instead of the usual failure rate data in the reliability model. The radiation degradation factor serves as the input that describes the radiation response of a component under a total radiation dose. In addition, several typical semiconductor components are selected as candidate components for the application of wireless monitoring in nuclear power plants.

Referring to FIG. 1, there is shown a radiation-tolerant circuit architecture 100 for executing one or more selected functions in a high radiation environment according to an embodiment of the present matter. The architecture 100 includes a redundant core comprising a plurality of independent channels 102 each duplicating the selected function, the channels being divided into groups of active channels (Ai) and spare channels (Si), and configurator blocks 114, 116 for reconfiguring one or more of power supply lines and internal buses between channels 102 in the event of a fault or failure being detected in any one of the channels 102. Specifically, the configurator blocks comprise a power configurator block 114 for reconfiguring power distribution to respective channels and a bus configurator block 116 for reconfiguring the plurality of communication buses, which comprise independent and diversified buses providing intra-channel and extra-channel communication.

The channels 102 are notionally divided into an input layer 106, a decision layer 108 and an output layer 110 corresponding to respective functions performed in a channel. In one embodiment of the present matter each channel is implemented using COTS components. In a further embodiment of the present matter the respective active channels and their corresponding spare channels are implemented with diversified COTS component technologies.

In the illustrated embodiment the channels 102 are grouped to form a triple modular redundancy (TMR) core of active channels (A₁, A₂, A₃), and a plurality of spare channels (S₁, S₂, S₃) each corresponding to a respective active channel. In the event that an active channel has malfunctioned, its corresponding spare channel will be reconfigured automatically to replace the failed channel, as will be described herein. In other embodiments different numbers of channel groups, active channels and spare channels may be implemented.

The following definitions are used to describe various channel states for the instance of a single TMR core. Of course, more than one core may be used depending on the needs of a particular application:

Definition 1: for one TMR core, the three active channels and their three respective spares may be defined as follows.

$A = \{A_1, A_2, A_3\}$,

where $A_i$ ($1 \le i \le 3$) represents the state of the $i$-th active channel, with $A_i = 1$ and $A_i = 0$ corresponding to its powered (active) state and non-powered (inactive) state, respectively.

$S = \{S_1, S_2, S_3\}$,

where $S_i$ ($1 \le i \le 3$) represents the state of the $i$-th spare channel, with $S_i = 1$ and $S_i = 0$ corresponding to its powered (active) state and non-powered (inactive) state, respectively.

Definition 2: for channels $A_i$ and $S_i$, their working conditions can be represented by the following sets:

$F_A = \{F_{A_1}, F_{A_2}, F_{A_3}\}$,

where $F_{A_i}$ describes the state of channel $A_i$. If $A_i$ is completely broken, then $F_{A_i} = 1$; otherwise $F_{A_i} = 0$.

$F_S = \{F_{S_1}, F_{S_2}, F_{S_3}\}$,

where $F_{S_i}$ describes the state of channel $S_i$. If $S_i$ is completely broken, then $F_{S_i} = 1$; otherwise $F_{S_i} = 0$.
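The state sets of Definitions 1 and 2 map naturally onto a small data structure. The following Python sketch is illustrative only: the class name `TmrCore` and its methods are hypothetical, indices are 0-based rather than 1-based, and the swap rule (power off the failed active channel, power on its own spare if that spare is not broken) is an assumption drawn from the spare-replacement behaviour described for the TMR core.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TmrCore:
    """State of one TMR core (hypothetical container for Definitions 1 and 2)."""
    # A[i], S[i]: 1 = powered (active), 0 = non-powered (inactive).
    A: List[int] = field(default_factory=lambda: [1, 1, 1])
    S: List[int] = field(default_factory=lambda: [0, 0, 0])
    # F_A[i], F_S[i]: 1 = completely broken, 0 = otherwise.
    F_A: List[int] = field(default_factory=lambda: [0, 0, 0])
    F_S: List[int] = field(default_factory=lambda: [0, 0, 0])

    def report_failure(self, i: int) -> bool:
        """Mark active channel i as broken and try to bring up spare i.

        Returns True if the spare took over, False if the spare is broken too.
        """
        self.F_A[i] = 1
        self.A[i] = 0              # remove power from the failed channel
        if self.F_S[i] == 0:
            self.S[i] = 1          # apply power to the corresponding spare
            return True
        return False

    def powered_channels(self) -> int:
        """Number of channels (active plus spare) currently powered."""
        return sum(self.A) + sum(self.S)


if __name__ == "__main__":
    core = TmrCore()
    core.report_failure(1)
    print(core.A, core.S, core.powered_channels())
```

Keeping the four sets as parallel lists mirrors the notation of the definitions, so each list index corresponds directly to the channel subscript (shifted by one).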

While existing designs utilize three redundant duplicates for critical circuits and subsystems, these are usually followed by a majority voter to select the most desirable output, or rely on extra added circuits to detect faults. A drawback is that these additional circuits are themselves subject to the same radiation damage. Moreover, most existing fault detection and diagnosis (FDD) methods for electronic systems focus mainly on common hardware or software faults in redundant systems, not on cross-board (cross-channel) radiation damage. The present architecture 100, in contrast, utilizes independent redundant channels without relying on additional detection units and/or hardware voters. The architecture 100 further provides for avoidance of common-mode damage between the redundant channels, and mechanisms for online fault detection, real-time preventive remedial actions, and rapid power loss or removal. The radiation-tolerant architecture 100 also uses a decision-making unit to achieve a high level of radiation tolerance and to prolong the lifespan of COTS-based systems in environments with a high level of radiation, as will be described in more detail below.

As described above, in one embodiment of the present matter the architecture 100 makes use of non-rad-hardened commercial off-the-shelf (COTS) devices to implement circuit functions, so as to gain the advantages offered by modern electronics. The COTS-implemented architecture is made more tolerant to radiation through (a) advanced circuit design using redundancies and fault-tolerant operating modes; and (b) properly designed radiation shielding using heavy materials against ionizing radiation. Hence, the designed system not only supports some of the advanced functionalities that a state-of-the-art system ought to offer, but is also sufficiently robust against high-level radiation. Such systems may, for example, be deployed in a nuclear power plant so that after a severe accident the system may continue to provide critical information for accident mitigation.

The redundant channels and their corresponding spare channels are implemented with diversified semiconductor technologies, e.g., one channel uses bipolar components, a second channel uses CMOS components, and a third channel uses hybrid components. Furthermore, the COTS semiconductor components are selected from those having higher radiation resistance than similar COTS components. This may be determined by calculating their radiation degradation factor from radiation test data, such that the selected component works normally under the specified total dose. This dose is generally designated to be around 20 krad (Si) for some implementations, although other implementations may designate a different threshold value.
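The selection step can be pictured as a simple filter over test data. The sketch below is a minimal illustration under stated assumptions: the radiation degradation factor is not defined in this passage, so the code stands in a single hypothetical figure per part, `tolerated_dose_krad` (the highest tested total dose at which the part still worked normally). The part names and dose values are made up for the example, and only the 20 krad (Si) default comes from the text.

```python
from dataclasses import dataclass
from typing import Dict, List

# Designated total dose from the text; other implementations may differ.
TID_THRESHOLD_KRAD = 20.0


@dataclass(frozen=True)
class Candidate:
    name: str                   # hypothetical part label
    technology: str             # e.g. "bipolar", "CMOS", "hybrid"
    tolerated_dose_krad: float  # assumed stand-in for the test-derived response


def select_components(candidates: List[Candidate],
                      threshold_krad: float = TID_THRESHOLD_KRAD) -> List[Candidate]:
    """Keep parts that still work normally at the designated total dose."""
    return [c for c in candidates if c.tolerated_dose_krad >= threshold_krad]


def group_by_technology(selected: List[Candidate]) -> Dict[str, List[str]]:
    """Group surviving parts by technology to check that diversity remains."""
    groups: Dict[str, List[str]] = {}
    for c in selected:
        groups.setdefault(c.technology, []).append(c.name)
    return groups


if __name__ == "__main__":
    pool = [
        Candidate("part-a", "bipolar", 35.0),  # illustrative values only
        Candidate("part-b", "CMOS", 12.0),
        Candidate("part-c", "hybrid", 20.0),
    ]
    print(group_by_technology(select_components(pool)))
```

Grouping the survivors by technology makes it easy to see whether enough distinct technologies remain to populate three diversified channels.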

In embodiments, as described herein, the radiation-tolerant architecture 100 is configured to have independent and diversified redundancy; online fault detection; real-time prognostic protection employing a prognostic algorithm to detect, identify, and prognosticate potential radiation-induced faults; rapid and proactive power-off recovery; and radiation protection techniques and diversity for avoiding common-mode failure and common-mode damage.

Furthermore, methods for improving the radiation resistance of each channel by assessing reliability under the given radiation conditions using modelling techniques, according to further embodiments of the present matter, are described herein.

In selecting COTS components, photocurrent responses indicate that the photocurrents of an ideal p-n diode under different levels of ionizing radiation can be reduced dramatically if the bias voltage on the junction is promptly reduced to zero. Hence, by removing power from the junction quickly in an event of radiation exposure, a semiconductor device may avoid permanent damage from the accumulated photocurrent. Thus, the radiation-tolerant architecture 100 includes rapid power-off protection strategies.

To ensure continued operation of the system in the event of potential radiation damage, the present architecture 100, in one instance, makes use of redundancy to ensure that not all channels fail at the same time. Channels are configured to detect and prognosticate faults and errors in a timely manner, and then to locate the faults and errors in order to generate a reconfiguration decision to deal with device power loss. Furthermore, in the redundant architecture, each redundant channel is completely independent and does not rely on the inclusion of typical additional measurement/test units or hardware majority voters.

As an example of a typical application, consider a sender in a modern digital remote communication/monitoring system that includes several subsystems such as input transducers, source encoders, channel encoders, modulators, and transmitters. Such a system may perform a selected function such as temperature monitoring. Self-diagnostic functions are integrated in the channel and therefore do not need any additional hardware. As mentioned above, each redundant channel is notionally divided into three layers: the input layer, the decision layer, and the output layer. A task of the input layer is to provide an interface to receive information coming from inputs such as input sensors, source encoders, and channel encoders, to name a few. Subsequently, fault detection, fault diagnosis, prognostic assessment, and reconfiguration suggestions are accomplished in the decision layer. The output layer then transmits and/or receives data outside the environment in which the system is deployed. This is typically an over-the-air transmission and thus employs circuit components such as a digital modulator and transceiver. The parameter measurement and self-diagnosis functions are accomplished within each redundant channel, without additional measurement units to detect and diagnose faults.

One of the weaknesses of redundant systems is their vulnerability to common-mode failures. In accordance with an embodiment of the present architecture, differences may be enforced between channels to prevent common-mode failures. The approaches are as follows:

-   Use diversified semiconductor technologies (E₁): Bipolar devices can withstand a higher total dose, but they are particularly sensitive to lower dose rates. MOS devices, on the other hand, are sensitive to higher total doses but can be robust to lower dose rates. Devices are chosen so that these properties complement each other and cover the foreseeable conditions.
-   Rely on diversified, but functionally equivalent, components (E₂): Among different channels, devices (mainly CPUs) rely on different technologies to implement identical functions. In this case, a microcontroller, an FPGA, and/or a microprocessor are used, as they offer different tolerance to radiation.
-   Select the same component, but from different manufacturers (E₃): Because different manufacturing processes (semiconductor materials, component size, etc.) can realize the same functionality for certain electronic components, but with different levels of radiation tolerance, it is beneficial to select components of the same functionality made by different manufacturers.
-   Use different tools for implementing different software and algorithms for the same functionalities (E₄): Because of differences in memory utilization and storage locations, the same software module developed using different programming languages and environments may respond differently to radiation effects. In this case, different programming environments are used to develop the modules for different channels.

In summary, channels $A_i$ and $A_j$ (i=1, 2, 3; j=1, 2, 3; and i≠j) are built with diversified hardware, diversified software, and different shielding protection (described later). However, channels $A_i$ and $S_i$ (i=1, 2, 3) are built with the same hardware, but with different software logic to achieve the same functionalities. Thus, the protection measures used in different channels can be summarized as follows:

$\quad\left\{ \begin{matrix} A_{i}\ \&\ A_{j}\ (i \neq j)\text{:} & E_{1}, E_{2}, E_{3}, E_{4} \\ A_{i}\ \&\ S_{j}\ (i \neq j)\text{:} & E_{1}, E_{2}, E_{3}, E_{4} \\ A_{i}\ \&\ S_{i}\text{:} & E_{4} \end{matrix} \right.$
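The summary above can be read as a lookup from a pair of channels to the set of diversity measures separating them. The Python sketch below encodes exactly the three listed cases; the `Channel` tuple encoding and the function name `diversity_measures` are hypothetical, and any pairing not listed in the summary (for example two spare channels) is rejected rather than guessed.

```python
from typing import FrozenSet, Tuple

# A channel is encoded as (kind, index), e.g. ("A", 1) for A_1, ("S", 2) for S_2.
Channel = Tuple[str, int]

ALL_MEASURES: FrozenSet[str] = frozenset({"E1", "E2", "E3", "E4"})


def diversity_measures(a: Channel, b: Channel) -> FrozenSet[str]:
    """Return the diversity measures between two channels per the summary."""
    (kind_a, i), (kind_b, j) = a, b
    kinds = {kind_a, kind_b}
    if i != j and kinds in ({"A"}, {"A", "S"}):
        # A_i & A_j, or A_i & S_j, with i != j: all four measures apply.
        return ALL_MEASURES
    if i == j and kinds == {"A", "S"}:
        # A_i & S_i: same hardware, different software logic only.
        return frozenset({"E4"})
    raise ValueError(f"pairing {a} / {b} is not covered by the summary")


if __name__ == "__main__":
    print(sorted(diversity_measures(("A", 1), ("A", 2))))
    print(sorted(diversity_measures(("A", 1), ("S", 1))))
```

Raising an error for unlisted pairings keeps the sketch from asserting anything the summary does not state.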

Fault Detection and Diagnosis

Even though measures have been taken, as described herein, in the system design and component selection processes, there is still a possibility that the system will not function trouble-free. To further improve the reliability of the system, real-time fault detection and diagnosis schemes are described herein according to embodiments of the present matter, so that remedial actions may be taken during operation to restore system performance, for example by a rapid power reset.

Referring to FIG. 2, there is shown an abstraction of a hierarchical fault model 200, according to an embodiment of the present matter. Radiation induced disturbances and/or other disturbances directly affect the system at the device level, after which the disturbances are transmitted to the circuit level and the system (subsystem) level. Faults at the device level (L1) correspond to sensors and semiconductor components; faults at the circuit level (L2) correspond to analog circuits, digital circuits, and mixed circuits; and faults at the system level (L3) correspond to subsystems or functional modules.

As previously discussed, the system may be further configured to detect and prognosticate faults and errors in a timely manner, and then to locate the faults and errors in order to make a reconfiguration decision to deal with device power loss. The fault detection unit detects abnormal operating conditions at various levels in radioactive environments, and estimates the nature and extent of the damage. Three definitions are given below to describe various states at the device, circuit and subsystem levels:

Definition 3 (Device): An electronic system consists of a number ($n_d$) of components.

$D = \{d_1, \ldots, d_{n_d}\}\ \forall d_i\ (1 \le i \le n_d)$,

where $d_i$ represents the state of the $i$-th component, with $d_i = 0$ being the operational state and $d_i = 1$ the fault state.

Definition 4 (Circuit): An electronic system consists of a number ($n_c$) of circuit modules. Each module consists of a number of components.

$C = \{c_1, \ldots, c_{n_c}\}\ \forall c_j\ (1 \le j \le n_c)$,

where $c_j$ represents the $j$-th circuit module in the electronic system. The operational and fault modes are represented in the same way as in Definition 3 in all subsequent definitions.

Definition 5 (Subsystem): An electronic system can be decomposed into a number ($n_s$) of subsystems. Each subsystem consists of several circuit modules.

$S = \{s_1, \ldots, s_{n_s}\}\ \forall s_k\ (1 \le k \le n_s)$,

where $s_k$ represents the $k$-th subsystem.
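The three levels of Definitions 3 to 5 can be sketched as nested membership maps, with faults propagating upward from devices to circuit modules to subsystems. The propagation rule used here, that a circuit module is flagged if any of its components is in a fault state and a subsystem is flagged if any of its circuit modules is flagged, is an assumption made for illustration; the text states only that disturbances are transmitted from the device level to the higher levels. All names in the example are hypothetical.

```python
from typing import Dict, List, Tuple


def propagate_faults(
    device_state: Dict[str, int],
    circuit_members: Dict[str, List[str]],
    subsystem_members: Dict[str, List[str]],
) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Propagate device-level fault flags (0 = operational, 1 = fault) upward.

    Assumed rule: a unit is flagged when any of its members is flagged.
    """
    circuit_state = {
        name: int(any(device_state[d] for d in members))
        for name, members in circuit_members.items()
    }
    subsystem_state = {
        name: int(any(circuit_state[c] for c in members))
        for name, members in subsystem_members.items()
    }
    return circuit_state, subsystem_state


if __name__ == "__main__":
    devices = {"d1": 0, "d2": 1, "d3": 0}           # D, with d2 faulted
    circuits = {"c1": ["d1", "d2"], "c2": ["d3"]}   # C: modules and their components
    subsystems = {"s1": ["c1"], "s2": ["c2"]}       # S: subsystems and their modules
    print(propagate_faults(devices, circuits, subsystems))
```

A real implementation could weight or filter the propagation instead of using a plain logical OR.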

Definition 6 (Functional State): For each circuit module and subsystem, two states can be defined:

-   $X_C$, $X_S$ represent temporary fault or recovered failure states of the circuit modules and subsystems, with $x = 0$ for operational and $x = 1$ for a temporary fault or recovered failure.
-   $Y_C$, $Y_S$ represent permanent failure states of the circuit modules and subsystems, with $y = 0$ for no failure and $y = 1$ for permanent failure.

For each circuit module, the following conditions can be defined for the operational state:

$X_C = \{x_{c_1}, \ldots, x_{c_{n_c}}\}\ \forall x_{c_j}\ (1 \le j \le n_c)$.

If circuit module $c_j$ operates incorrectly, $x_{c_j} = 1$; otherwise $x_{c_j} = 0$.

$Y_C = \{y_{c_1}, \ldots, y_{c_{n_c}}\}\ \forall y_{c_j}\ (1 \le j \le n_c)$.

If $c_j$ has completely failed, $y_{c_j} = 1$; otherwise $y_{c_j} = 0$.

For each subsystem, the following states can be defined:

$X_S = \{x_{s_1}, \ldots, x_{s_{n_s}}\}\ \forall x_{s_k}\ (1 \le k \le n_s)$.

If $s_k$ operates incorrectly, $x_{s_k} = 1$; otherwise $x_{s_k} = 0$.

$Y_S = \{y_{s_1}, \ldots, y_{s_{n_s}}\}\ \forall y_{s_k}\ (1 \le k \le n_s)$.

If $s_k$ has completely failed, $y_{s_k} = 1$; otherwise $y_{s_k} = 0$.

Based on the above definitions, a fault hypothesis for malfunctions of circuit modules and subsystems can be formed as in Eq. (1), where the goal is to integrate the states of the circuit modules and subsystems.

$H = [X, Y]$  (1)

where X combines $X_C$ and $X_S$, and Y combines $Y_C$ and $Y_S$.

A detection function E(H), defined in Eq. (2), reflects the credibility of H. A smaller E(H) suggests a higher credibility of H. If the detection function is equal to or greater than unity, a reconfigure command should be issued.

$\begin{matrix}{E(H) = \sum\limits_{j=1}^{n_{c}}\left( w_{xc_{j}}x_{c_{j}} + w_{yc_{j}}y_{c_{j}} \right) + \sum\limits_{k=1}^{n_{s}}\left( w_{xs_{k}}x_{s_{k}} + w_{ys_{k}}y_{s_{k}} \right).} & (2)\end{matrix}$

where $w_{xc_j}$, $w_{yc_j}$, $w_{xs_k}$, and $w_{ys_k}$ are the weights of the discrepancy index. The weights range from 0.1 to 1. If $w_1 \gg w_2$, it means that the discrepancy index weighted by $w_1$ is much more important than the one weighted by $w_2$. The values of these weights are determined according to the significance of the circuit modules and subsystems in the electronic system.
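Eq. (2) is a weighted sum over the state flags, compared against unity. A minimal Python sketch of that computation follows; the function names and the example weights are hypothetical, and only the 0.1 to 1 weight range and the threshold of 1 come from the text.

```python
from typing import Sequence


def _check_weights(*weight_vectors: Sequence[float]) -> None:
    """Enforce the stated weight range of 0.1 to 1."""
    for weights in weight_vectors:
        for w in weights:
            if not 0.1 <= w <= 1.0:
                raise ValueError(f"weight {w} is outside the range 0.1 to 1")


def detection_value(x_c: Sequence[int], y_c: Sequence[int],
                    x_s: Sequence[int], y_s: Sequence[int],
                    w_xc: Sequence[float], w_yc: Sequence[float],
                    w_xs: Sequence[float], w_ys: Sequence[float]) -> float:
    """Evaluate E(H) of Eq. (2) from the circuit and subsystem state flags."""
    _check_weights(w_xc, w_yc, w_xs, w_ys)
    if not (len(x_c) == len(y_c) == len(w_xc) == len(w_yc)):
        raise ValueError("circuit-level vectors must all have length n_c")
    if not (len(x_s) == len(y_s) == len(w_xs) == len(w_ys)):
        raise ValueError("subsystem-level vectors must all have length n_s")
    circuit_term = sum(wx * x + wy * y
                       for wx, x, wy, y in zip(w_xc, x_c, w_yc, y_c))
    subsystem_term = sum(wx * x + wy * y
                         for wx, x, wy, y in zip(w_xs, x_s, w_ys, y_s))
    return circuit_term + subsystem_term


def should_reconfigure(e_h: float) -> bool:
    """A reconfigure command is issued when E(H) reaches or exceeds unity."""
    return e_h >= 1.0


if __name__ == "__main__":
    # One temporary fault in circuit module 1; weights are illustrative.
    e = detection_value([1, 0], [0, 0], [0], [0],
                        [0.3, 0.3], [1.0, 1.0], [0.5], [1.0])
    print(e, should_reconfigure(e))
```

With weights chosen this way, a single permanent failure weighted at 1 is enough to trigger reconfiguration, while isolated temporary faults with small weights are tolerated.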

Prognostic for Lifespan of Components

Prognosis protection provides functions: (1) to predict the behavior ofa circuit based on the present measurements, and hence to estimatewhether a module or a subsystem can remain functional before completefailure occurs; and (2) to select the most appropriate channels for theradiation environment and corresponding characteristics of thediversified hardware. A hypothesis to predict malfunction of a deviceand a circuit block can be defined as follows:

P=[p_(d),p_(c)].  (3)

where

p_(d) = {p_(d₁), …  , p_(d_(n_(d)))}

represents the state of the i_(th) device with p_(d) _(i) =0 and 1,respectively, based on the prediction of its operational and faultstates, and

p_(c) = {p_(c₁), …  , p_(c_(n_(c)))}

represents the prediction of incorrect circuit operation. If c_(j) ispredicted to operate incorrectly, then p_(c) _(j) =1, otherwise p_(c)_(j) =0.

A prognostic function E_(n)(P) can be formed to reflect the credibility of the prediction P, as defined in Eq. (4). A smaller E_(n)(P) suggests a higher credibility of P.

$\begin{matrix}{{E_{n}(P)} = {{\sum\limits_{j}^{n_{d}}{W_{d_{i}}p_{d_{i}}}} + {\sum\limits_{j}^{n_{c}}{W_{c_{j}}{{p_{c_{j}}\left( {{n = 1},2,3} \right)}.}}}}} & (4)\end{matrix}$

where w_(d_i) and w_(c_j) are the weights of the discrepancy index of devices and circuit blocks.

A function can also be used to reflect whether a specific semiconductor technology used in a specific channel can work correctly in a given radiation environment.

R _(n) =f(s,d)(n=1,2,3).  (5)

where s is the information about the radiation environment; d is the information on the semiconductor technologies; and R_(n) is the predicted channel selection. If channel n is estimated to have no capacity to operate in the given environment for a specific semiconductor technology, R_(n)=1; otherwise R_(n)=0.

Using the fault prognostic function, if E_(n)(P) is equal to or greater than 1, or R_(n)=1, the reconfiguration command may be issued by the decision-making unit.
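The prognostic decision rule above can be sketched as follows. This is an illustrative reading of Eq. (4) and of the R_(n) flag; the function names and the example weights are hypothetical.

```python
from typing import Sequence


def prognostic_function(w_d: Sequence[float], p_d: Sequence[int],
                        w_c: Sequence[float], p_c: Sequence[int]) -> float:
    """Evaluate E_n(P) of Eq. (4) for one channel n.

    p_d[i] and p_c[j] are 1 when device i or circuit block j is predicted
    to malfunction and 0 otherwise.
    """
    if len(w_d) != len(p_d) or len(w_c) != len(p_c):
        raise ValueError("each weight needs exactly one prediction flag")
    device_part = sum(w * p for w, p in zip(w_d, p_d))
    circuit_part = sum(w * p for w, p in zip(w_c, p_c))
    return device_part + circuit_part


def reconfiguration_may_be_issued(e_p: float, r_n: int) -> bool:
    """True when E_n(P) >= 1 or the channel is predicted unusable (R_n = 1)."""
    return e_p >= 1.0 or r_n == 1


# Hypothetical channel: two devices predicted to fail, circuit block still fine.
e_p = prognostic_function([0.4, 0.7], [1, 1], [0.5], [0])
print(e_p, reconfiguration_may_be_issued(e_p, r_n=0))
```

Note that R_(n)=1 triggers reconfiguration regardless of E_(n)(P), which reflects the case in which the semiconductor technology of the channel is not suited to the environment even though no malfunction has yet been predicted.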

The Mechanism of the Redundant System

Referring to FIG. 3 there is shown a flowchart of a master selection mechanism 300 for the channels according to an embodiment of the present matter. At any given time, only one channel provides an operation path for signals from the input layer to the output layer for the system to function normally. This channel may be termed the primary channel. Signals in this channel have to pass through the decision-making unit in the decision layer, which includes integrated voting functions. The redundant channels may be termed checkers. They may be selected by the selection mechanism through the IO bus. The states of the channels can change dynamically if a fault occurs in the primary channel. For internal information exchange among the primary channel and its checkers, the decision-making unit uses two types of buses as discussed earlier: the internal bus for information exchange with other channels, and the IO bus for selection of the primary channel. All buses operate independently. A fault on one channel does not affect the operation of another channel.

Referring to FIG. 4 there is shown a block diagram of the decision-making unit 400 according to an embodiment of the present matter. Information is transmitted over its internal bus to fault detection, fault diagnosis, and fault prognostic schemes to generate suitable reconfiguration decisions. The decision will include rapid power-off of the failed channels. If a channel and its spare have both failed, a failure signal R_(Mi) is registered. This channel will be permanently removed from the system. As mentioned earlier, diversity in component selection has been extensively used to avoid simultaneous failures of all three channels in this system.

In particular, the R_(Mi) signals are provided only by the primary channel, and only in two cases: when both a channel and its spare are in a state of failure, or when neither is suitable to work at a given radiation level. In addition, it is assumed that cases in which all three channels simultaneously encounter faults or failures can be avoided by using a diversity of techniques.

The proposed system operates as follows: when one channel fails to operate, the failure is detected by the self-diagnosis and/or the external-diagnosis function units. The decision-making unit in another channel then generates reconfiguration recommendations to cut off the power of the failed channel in a timely manner, and its spare channel is powered up to form a new TMR core.

Referring to FIG. 5 there is shown a flowchart 500 for decision making in the i^(th) channel of the TMR_(i) core, according to an embodiment of the present matter. The decision logic unit is configured to integrate the functions of fault diagnosis and component life-span prognostics to generate potential reconfiguration signals R_(Si) and R_(Mi). Specifically, all channels have the ability to detect, diagnose, and configure other channels in the TMR core until all channels have failed.

To illustrate, consider the following example. If the semiconductor technology used in a channel A_(i) has no capacity to operate correctly in the given radiation environment (R_(i)=1), or if a channel A_(i) and its corresponding spare S_(i) have both failed, this channel and its spare will be instructed to power off. Otherwise, only one of them is instructed to power on. The active state of all channels A_(i) and S_(i) can be described by Eq. (6).

$$\begin{cases} S_i = 0 \;\&\; A_i = 0, & \text{if } F_{A_i} = 1 \;\&\; F_{S_i} = 1, \text{ or } R_i = 1 \\ S_i = \overline{A}_i, & \text{otherwise} \end{cases} \tag{6}$$

The detailed logic of the reconfiguration commands is determined by the outputs of the fault diagnosis and prognosis schemes, as illustrated in Eq. (7) and Eq. (8). The signal R_(S) is used to switch the power supply between the active channel and its spare, and the signal R_(M) is used to remove the power supply of one active channel and that of its spare. If any one of the detection function (E_(i)(H)), the prognostic function (E_(i)(P)), or the predicted channel selection (R_(i)) is set, reconfiguration commands will be issued.

$\begin{matrix}{R_{Si}\begin{matrix}{= {\overset{\_}{R}}_{Si}} & {{{{{{if}\mspace{14mu} {E_{i}(H)}} \geq {1\mspace{14mu} {or}\mspace{14mu} {E_{i}(P)}} \geq 1}\&}\mspace{14mu} R_{i}} = {0\mspace{14mu} {\left( {1 \leq i \leq 3} \right).}}}\end{matrix}} & (7) \\\left\{ {\begin{matrix}{R_{Mi} = 0} & {{{if}\mspace{14mu} R_{j}} = {{1\mspace{14mu} {or}\mspace{14mu} R_{k}} = {1\mspace{25mu} \left( {{1 \leq i},j,{k \leq 3},{i \neq j \neq k}} \right)}}} \\{R_{Mi} = 0} & {{{if}\mspace{14mu} F_{Aj}} = {{1\mspace{14mu} {and}\mspace{20mu} F_{Sj}} = {1\mspace{31mu} \left( {{1 \leq i},{j \leq 3},{i \neq j}} \right)}}} \\{R_{Mi} = 1} & \text{otherwise}\end{matrix}.} \right. & (8)\end{matrix}$
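The power-state rule of Eq. (6) and the switching rule of Eq. (7) can be sketched in a few lines. Two points are assumptions made for illustration: in the "otherwise" branch of Eq. (6) the sketch keeps the active channel powered unless it has failed, and the condition of Eq. (7) is read as (E_i(H) ≥ 1 or E_i(P) ≥ 1) and R_i = 0. The function names are hypothetical.

```python
from typing import Tuple


def channel_power_states(f_a: int, f_s: int, r: int) -> Tuple[int, int]:
    """Return (A_i, S_i) following Eq. (6); 1 means powered, 0 means unpowered."""
    if (f_a == 1 and f_s == 1) or r == 1:
        # Both failed, or the technology cannot operate here: power both off.
        return 0, 0
    # Otherwise exactly one of the pair is powered (S_i is the complement of A_i).
    # Assumption: prefer the active channel while it has not failed.
    a = 1 if f_a == 0 else 0
    return a, 1 - a


def toggle_r_s(r_s: int, e_h: float, e_p: float, r: int) -> int:
    """Toggle R_Si per Eq. (7) when a fault is detected or predicted and R_i = 0."""
    if (e_h >= 1.0 or e_p >= 1.0) and r == 0:
        return 1 - r_s
    return r_s


print(channel_power_states(f_a=1, f_s=0, r=0))        # active failed, spare takes over
print(toggle_r_s(r_s=0, e_h=1.2, e_p=0.0, r=0))       # detection function tripped
```

Eq. (8) is not reproduced in the sketch because its signal polarity depends on how the configurator interprets R_(Mi), which is defined by the hardware of FIG. 6a.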

Signals for the configurator suggestions are generated by the decision-making unit in other channels, as illustrated in Table 1.

TABLE 1

| Configurator signal | A₁ & S₁ | A₂ & S₂ | A₃ & S₃ | Primary |
|---|---|---|---|---|
| R_(S1) | | ✓ | ✓ | |
| R_(S2) | ✓ | | ✓ | |
| R_(S3) | ✓ | ✓ | | |
| R_(M1) & R_(M2) & R_(M3) | | | | ✓ |

In general, it is difficult to detect the online radiation response of each semiconductor device in an electronic system without additional measurement/testing units. In the present redundant system, detection is performed at the circuit level and the system level. All circuit modules and subsystems are monitored by the external channels and/or by their own channel, so that power can be rapidly removed when radiation damage is encountered. Then, according to the output of the circuit modules and subsystems, the damage to the component(s) may be analyzed. In a typical digital communication system, a sender is usually implemented with a variety of semiconductor components, which are listed in Column 2 of Table 2 below. The radiation responses of some sample components and the resulting damage to the subsystem are listed in Columns 3, 4, and 5 of Table 2, and a fault detection method for each component is given in Column 6.

TABLE 2 The analysis of faults and detection mechanism

Total ionizing dose (TID)

| Function | Component | Radiation effects | Radiation responses of component | Damage response | Detection mechanism |
|---|---|---|---|---|---|
| Input Source | Voltage reference | TID | The degradation of V_(z) is within specification for high dose rate. | The output voltage decreases; OPs are nonfunctional. | External detection |
| | | SEU, SEL | Short only for SEU, increasing with a latchup current. | | |
| | Bipolar OP | TID | The degradation depends on both the manufacturer and the circuit configuration. | OPs are nonfunctional. The output of the function of input source will be incorrect. | External detection |
| | | SEL | The degradation in current during irradiation. | | |
| | | SET | Susceptible to SET; positive SETs are expected for positive supply voltage; both input and supply voltages affect amplitude and duration. | | |
| | NPN BJT | TID | The primary ionizing response of BJTs is the degradation of the current gain β (I_(c)/I_(b)), particularly at low dose rates. | The output of the function of input source will be incorrect. | External detection |
| Source Encoder | Voltage reference diode | TID | Increase of the reverse current and changes of the forward voltage. | The AD's reference voltage will be incorrect. | External detection |
| | A/D converter | TID | Electronic parameters are higher under high radiation dose; the part experiences functional failure at high irradiation levels. | The output of the functions of source encoder will be incorrect. | External detection |
| | | SEU | A number of least significant bits (LSBs) are masked out under the condition of positive analog input; the LET threshold for the negative input is significantly higher. | | |
| | | SEL | The LET threshold for SEL is higher; no SEL was observed in some radiation tests. | | |
| | | SEFI | Causes every conversion to be in error until reset by cycling power to the device. | | |
| Channel Encoder & Decision Making & Digital Modulator | Micro-controller (CPU) | TID | Parameters exceed the maximum specification limit when the dose is more than 10K Rad (Si). | Microcontroller will be nonfunctional. | External detection |
| | | SEU, SEL, SEFI on SRAM | A logic gate switch, voltage transients, alteration of stored information, and destructive effects. | SRAM will be nonfunctional. | Internal detection |
| | | SET, SEL, SEU, SEFI, TID on Flash | SETs are high current transients, possibly upset-producing events; the memory's contents are altered during the transient events. | Flash will be nonfunctional. | Internal detection |
| | | SEE on GPIO | The logical switch on GPIO ports. | The output of GPIO port will be nonfunctional. | External detection |
| | Logic gate | TID | The degradation of electrical parameters during high irradiation level; the part is functional and stays within the specification limit. | Microcontroller will be nonfunctional. | External detection |
| | | SEU, SEL | A logic gate switch; destructive effects occur. | | |
| Transceiver | Voltage reference diode | TID | Increase of the reverse current and changes of the forward voltage. | Wireless transmitter will not work. | Internal detection |
| | Varactor | TID | Increase of the reverse current, but not of a serious degree; the forward-voltage drop does not essentially change. | Wireless transmitter will not work. | Internal detection |
| | Wireless transmitter | TID | The failure of functions. | Wireless transmitter will be nonfunctional. | External detection |

As illustrated in Table 2, when radiation effects occur in semiconductor components, the related circuit block and/or subsystem may become nonfunctional. Through external detection and/or internal detection of nonfunctional subsystems, the semiconductor components can therefore be monitored online. Subsequently, the decision-making unit generates reconfiguration suggestions to rapidly remove the power of the affected channel and to power on its spare. The system will not work when all redundant channels are damaged.

Referring to FIGS. 6a and 6b there is shown a schematic diagram of the power configurator 114 and the bus configurator 116 respectively. Both configurators are hardware switches and have radiation resistance higher than that of all redundant channels, so that the configurators 114, 116 do not become a weak point in the radiation tolerance of the system. In one embodiment the power configurator 114 comprises at least a pair of switches to control the power supply to each channel, as well as the location of internal buses (the bus configurator 116), which are determined by the reconfigure suggestions (R_(Si) & R_(Mi)). The power configurator 114 is configured to guarantee that the system has at most three channels working simultaneously. Recall from photocurrent studies that p-n junctions are less likely to sustain permanent damage from ionizing radiation if the junctions are unpowered. Hence, having at most three channels powered minimizes the risk of damage to the remaining channels. As shown in FIG. 6a, the bus configurator 116 serves as the independent communication mechanism. This way, the bus will not affect other channels when one channel fails. In addition, V_(in_1), V_(in_2), V_(in_3), and V_(in_r) are the power inputs to the redundant channels and relays, and V_(Ai), V_(Si) are the power supplies for the TMR core active channels A_(i) and spare channels S_(i), which are controlled by reconfigure commands (R_(Si) & R_(Mi)). The system also has independent and diversified buses: the internal bus (labelled Combus in FIG. 6b), to exchange information with other channels; and the IO bus (labelled IO bus in FIG. 6b), to accomplish the selection of primary channels.

To ensure reliable operation under given radiation conditions, both the bus configurator 116 and the power configurator 114 may have a higher level of radiation tolerance than the rest of the electronic components in the system. Thus, both units may be designed using passive devices only, such as resistors (tolerant in at least a range of 10⁴-10¹⁰ Gy), capacitors (tolerant in at least a range of 10⁴-10⁸ Gy), and non-electronic relays (tolerant in at least a range of 10⁵-10⁷ Gy).

As previously discussed, it is difficult to diagnose radiation damage in electronic systems because of the lack of a self-diagnosis architecture and of online diagnosis methods for post-irradiation responses. Most existing fault detection and diagnosis (FDD) methods for electronic systems focus mainly on common hardware faults in redundant systems, not on cross-board (cross-channel) radiation damage. Some model-based FDD methods have been considered, but it is not a trivial task to develop accurate models to deal with potential failure modes caused by radiation. Moreover, those methods usually detect and diagnose fault occurrences by using additional measurement/test units or majority voters, which, as previously discussed, are also affected and damaged by radiation. These units therefore represent a major weakness in the whole system. Typical systems fail to address the following aspects of the detection and diagnosis of radiation damage:

-   analysis and identification of fault, error, and failure of devices and circuits under the given radiation condition.
-   online logic to detect radiation damage and a real-time algorithm to diagnose and to locate radiation damage.
-   validation of the developed detection method without physical radiation test in the design phase.

According to another embodiment of the present matter there is described a system and method for combining the radiation-tolerant architecture with online detection and diagnosis to identify and locate radiation damage in the system in a timely manner, which may prolong the life of the system.

Referring to FIG. 7 there is shown a functional organization and data flow diagram 700 for the fault detection and diagnosis in the decision-making unit 400 shown in FIG. 4. The functions include two parts: (a) database creation and (b) real-time fault detection and prognosis for decision-making. In the first part, data specifications of the electronic components and boundaries of faults, errors, and failures are obtained to create an alarm database. Such information is used to create a fault detection hypothesis test framework. During online operation, measurements are then used to test the hypothesis and, subsequently, to generate appropriate decisions in the decision-making unit 400 for control of the configurators 114, 116.

If the state of each level is defined as x_(i) (i=1, 2, 3), the model can be described as follows:

$$\begin{cases} x_1(k+1) = (A + \Delta A)\,x(k) + (B + \Delta B)\,u(k) + B_r n_r(k) + B_o n_o(k) \\ x_2(k+1) = (C + \Delta C)\left(x_1(k) + \Delta x_1\right) \\ x_3(k+1) = y(k+1) = (D + \Delta D)\left(x_2(k) + \Delta x_2\right) \end{cases} \tag{7-1}$$

where

x_(i)(k)∈R^(n), u(k)∈R^(m), y(k)∈R^(p), n_(r)(k)∈R^(l_r), n_(o)(k)∈R^(l_o)

are, respectively, the state of the different levels, the input, the output, the radiation fault, and the component/parameter fault. A, B, C, D are known parameter matrices; and ΔA, ΔB, ΔC, ΔD, Δx₁, Δx₂ are unknown faults and errors. As previously described, the system should detect and diagnose faults and errors (ΔA, ΔB, ΔC, ΔD, Δx₁, Δx₂) in a timely manner. A number of assumptions for faults are listed as follows: each component is either functional, in a fault state, or failed; each circuit block is either functional, operating incorrectly, or failed; each subsystem is either functional, operating incorrectly, or failed; and all components are functional at the initial moment.

As described earlier, definitions were provided for Devices (D), Circuits (C) and Subsystems (S) along with their various states. The following additional definitions are provided:

Definition 7 (Logic Action): R_(dc) is the relation from set D to set C, and R_(cs) is the relation from set C to set S. The entries of the corresponding relation matrices M_(dc) and M_(cs) are defined by:

$$M_{dc_{i,j}} = \begin{cases} 1, & (d_i, c_j) \in R_{dc} \\ 0, & (d_i, c_j) \notin R_{dc} \end{cases} \quad \text{and} \quad M_{cs_{j,k}} = \begin{cases} 1, & (c_j, s_k) \in R_{cs} \\ 0, & (c_j, s_k) \notin R_{cs} \end{cases}$$

Thus, the relation from set D to set S can be expressed by:

$$M_{ds_{i,k}} = M_{dc_{i,j}} \times M_{cs_{j,k}}.$$
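As an illustration of Definition 7, the device-to-subsystem relation can be computed from the two relation matrices. The sketch below assumes that the product is a Boolean composition over the intermediate circuit index j (a device is related to a subsystem if it belongs to at least one circuit block of that subsystem); this reading and the example matrices are assumptions, not part of the described embodiment.

```python
from typing import List

Matrix = List[List[int]]


def compose_relations(m_dc: Matrix, m_cs: Matrix) -> Matrix:
    """Boolean composition of the D-to-C and C-to-S relation matrices.

    m_dc[i][j] = 1 if device d_i belongs to circuit block c_j.
    m_cs[j][k] = 1 if circuit block c_j belongs to subsystem s_k.
    The result m_ds[i][k] = 1 if some circuit block links d_i to s_k.
    """
    n_c = len(m_cs)
    if any(len(row) != n_c for row in m_dc):
        raise ValueError("column count of m_dc must equal row count of m_cs")
    n_s = len(m_cs[0]) if m_cs else 0
    return [
        [int(any(row[j] and m_cs[j][k] for j in range(n_c))) for k in range(n_s)]
        for row in m_dc
    ]


# Hypothetical layout: three devices, two circuit blocks, two subsystems.
m_dc = [[1, 0], [0, 1], [1, 1]]
m_cs = [[1, 0], [0, 1]]
print(compose_relations(m_dc, m_cs))
```

Such a matrix makes the propagation path explicit: a zero entry means that damage to the device cannot affect that subsystem, which is what the conditions Fc_(j_i)=0 and Fs_(k_j)=0 in Definition 8 express.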

Definition 8 (Fault Set): for the circuit block c_(j), the fault set is

Fc_(j) = {Fc_(j)₁, …  , Fc_(j)_(n_(d))}.

Fc_(j_i) describes the ionizing radiation effects of the i-th component d_(i) on the circuit block c_(j). Fc_(j_0) denotes the functional state of the circuit block c_(j), which takes component tolerance effects into account.

Fc_(j) _(i) =0 if M_(i,j)=0.

For the subsystem s_(k), the fault set is

Fs_(k) = {Fs_(k)₁, …  , Fs_(k)_(n_(c))}.

Fs_(k_j) describes the ionizing radiation effects of the j-th circuit block c_(j) on the subsystem s_(k). Fs_(k_0) denotes the functional state of the subsystem s_(k).

Fs_(k) _(j) =0 if M_(j,k)=0.

Identification of Fault, Error, and Failure

The identification focuses on analog and mixed circuit blocks with certain input. Suppose that u is the measured voltage of the output of one circuit block (c_(j)). An ambiguity region of the output of the circuit block (c_(j)) for all components d_(i) can be created in the time domain.

u^(d_(i))(t) = {u^(d₁)(t), u^(d₂)(t), …  , u^(d_(n_(d)))(t)}.

with

u ^(d) ^(i) (t)=0 if M _(i,j)=0(1≤i≤n _(d),1≤j≤n _(c)).

In general, the element value with component tolerance is changed from Y to Y+ΔY. The upper and lower envelopes of the output of the circuit block (c_(j)) for all component responses are:

u_(upper)^(d)(t) = {max (u^(d₁)(t), max (u^(d₂)(t)), …  , max (u^(d_(n_(d)))(t)))}.andu_(lower)^(d)(t) = {min (u^(d₁)(t)), min (u^(d₂)(t)), …  , min (u^(d_(n)_(d))(t))}.

Thus, the response for the functional state of the circuit block (c_(j))is:

u _(lower) ^(d)(t)≤u(t)≤u _(upper) ^(d)(t).

On the other hand, consider the output of the circuit block (c_(j)) for each component d_(i) under the conditions of fault, error, and failure (u_(fault)^(d_i), u_(err)^(d_i), and u_(fail)^(d_i)). The envelopes of the circuit c_(j) output for a sensitive component d_(i) under the fault state, Eq. (7-2), and under the failure state, Eq. (7-3), are:

u _(fault) ^(d) ^(i) (t)≤u(t)≤u _(err) ^(d) ^(i) (t).  (7-2)

u _(err) ^(d) ^(i) (t)≤u(t)≤u _(fail) ^(d) ^(i) (t).  (7-3)

The fault, error, and failure responses of the circuit block (c_(j)), u_(fault)^(c_j), u_(err)^(c_j), and u_(fail)^(c_j), can also be obtained. The upper and lower envelopes of the fault state of the circuit c_(j) response are:

u _(fault) ^(c) ^(j) (t)≤u(t)≤u _(err) ^(c) ^(j) (t).  (7-4)

The upper and lower envelopes of the broken state of the circuit block (c_(j)) response are:

u _(err) ^(c) ^(j) (t)≤u(t)≤u _(fail) ^(c) ^(j) (t).  (7-5)

According to Eq. (7-2)-Eq. (7-5), malfunction of components and circuit blocks in analog and mixed circuits can be classified into several types:

-   Component operates incorrectly: the output of the related circuit block should range from u_(fault)^(d_i) to u_(err)^(d_i);
-   Component fails: the output of the related circuit block should range from u_(err)^(d_i) to u_(fail)^(d_i);
-   Circuit block operates incorrectly: the output of the circuit block should range from u_(fault)^(c_j) to u_(err)^(c_j);
-   Circuit block is broken: the output of the circuit block should range from u_(err)^(c_j) to u_(fail)^(c_j).
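The classification above can be illustrated at a single time instant with a small sketch. The band values, the state labels, and the tie-breaking at the shared boundary u_err (which is assigned here to the "operating incorrectly" band) are assumptions made for illustration.

```python
from typing import Tuple

Band = Tuple[float, float]


def in_band(u: float, band: Band) -> bool:
    """True if u lies between the two band limits, whichever is larger."""
    low, high = min(band), max(band)
    return low <= u <= high


def classify_output(u: float, functional: Band, u_fault: float,
                    u_err: float, u_fail: float) -> str:
    """Classify a measured output voltage using the envelopes of Eq. (7-2)-(7-5)."""
    if in_band(u, functional):
        return "functional"
    if in_band(u, (u_fault, u_err)):
        return "operating incorrectly"
    if in_band(u, (u_err, u_fail)):
        return "failed"
    return "unclassified"


# Hypothetical 5 V block whose output drifts upward as damage accumulates.
limits = dict(functional=(4.9, 5.1), u_fault=5.2, u_err=5.6, u_fail=6.0)
for measured in (5.0, 5.4, 5.8, 7.0):
    print(measured, classify_output(measured, **limits))
```

The same function can be applied with component-level limits (superscript d_i) or circuit-level limits (superscript c_j), since both pairs of inequalities have the same form.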

Fault diagnosis in analog and mixed circuits aims to identify the current state of the circuit block according to the measured value u. If u is within the neighborhood of the nominal value under fault F_(i), the similarity between the current state and fault F_(i) is high. On the other hand, if u is outside the neighborhood, the similarity will be low. U_(F_i)(u) is used to express the similarity between the current state and the fault F_(i) state. According to the maximum-degree criterion, if fault F_(i) satisfies

$\begin{matrix}{{U_{F_{i}}(u)} = {\max {\left\{ {{U_{F_{0}}(u)},{U_{F_{1}}(u)},{U_{F_{2}}(u)},\ldots \mspace{14mu},{U_{F_{n_{d}}}(u)}} \right\}.}}} & \left( {7\text{-}6} \right)\end{matrix}$

Then u is deemed to belong to F_(i), and the current state is most similar to the fault F_(i) state.
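The maximum-degree criterion of Eq. (7-6) amounts to selecting the fault state with the largest similarity. The sketch below uses a triangular similarity function centered on a nominal voltage for each fault; the shape of this function, the nominal voltages, and the width are hypothetical, since the method of determining U_(F_i)(u) is left to the internal and external detection schemes.

```python
from typing import Dict, Tuple


def triangular_similarity(u: float, nominal: float, width: float) -> float:
    """Similarity of 1 at the nominal value, falling linearly to 0 at +/- width."""
    return max(0.0, 1.0 - abs(u - nominal) / width)


def diagnose(u: float, nominal_by_fault: Dict[str, float],
             width: float) -> Tuple[str, Dict[str, float]]:
    """Return the fault label with the largest similarity, per Eq. (7-6)."""
    similarities = {
        fault: triangular_similarity(u, nominal, width)
        for fault, nominal in nominal_by_fault.items()
    }
    # Ties are resolved in favor of the first label in the dictionary.
    best = max(similarities, key=similarities.get)
    return best, similarities


# Hypothetical nominal outputs: F0 is the functional state, F1 and F2 are faults.
nominals = {"F0": 5.0, "F1": 4.2, "F2": 3.1}
label, scores = diagnose(4.3, nominals, width=0.5)
print(label, scores)
```

When the neighborhoods of two faults overlap, several similarities are non-zero at once, and a further sensitivity analysis of the components involved is needed to separate them.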

According to the characteristics of different circuit blocks and/or subsystems, the method of determining U_(F_i)(u) can be separated into internal detection and external detection.

Referring to FIG. 7 there is shown a general framework for fault detection and diagnosis schemes 700 according to an embodiment of the present matter. As described above, channels may be composed of devices, circuits, and subsystems. Damage to a device propagates to the circuit and the subsystem. According to the characteristics of circuit modules and subsystems, detection of a fault state can be carried out within the module's own channel or by using the data from other channels. For some circuit modules, such as power-related circuits and self-test circuits, faults can be detected within the channel. However, for other circuit modules, particularly those with uncertain inputs such as sensor inputs and sub-functional blocks, it would be a challenge to validate their functionality within the channel. In those cases, fault detection is often accomplished by comparison with the measurements from other channels. These two approaches may be described as follows.

Referring to FIG. 8 there is shown a block diagram for detection logic allocation 800 according to an embodiment of the present matter. Consider first internal detection. For analog and mixed circuits with certain input, the determination of U_(F_i)(u) is accomplished by comparing the measured voltage with the voltage distribution under the fault state. The voltage distribution under the fault state can be obtained from the calculation result of the identification of the fault, error, and failure. For example, suppose the voltage distribution of a circuit block (c_(j)) under the fault state is as shown in FIG. 9, which illustrates voltage levels of a circuit block under the fault state.

When there is free space between u^(F_0)(t) and u^(F_i)(t), if the measured voltage u is located in the region of u^(F_0)(t) or u^(F_i)(t), then

U _(F) ₀ (u)=1 or U _(F) _(i) (u)=1.

When there is no free space between u^(F_1)(t) and u^(F_2)(t), and u is located in the overlap region of u^(F_1)(t) and u^(F_2)(t), the similarity between the current state and the fault F₁ and F₂ states can be determined by sensitivity analysis for d₁ and d₂.

For external detection, on the other hand, which applies to circuit blocks with uncertain input, the determination of U_(F_i)(u) is accomplished by combining an error detection code with the voter mechanism. The information of circuit blocks and subsystems can be encoded and transmitted to the primary channel through the internal bus. The primary channel then performs the function of detecting damage among all three channels. As previously mentioned, the inputs of those circuits are unknown; moreover, in high-level radiation fields, radiation damage may occur in one, two, or even all three of the redundant channels simultaneously. Detecting radiation damage in those circuits is therefore difficult when only majority voters and/or additional test/detection units are used.

A filter function may be used to detect radiation damage in the three channels according to past and present measurements, as expressed in Eq. (7-7). The detection function will output the states of those circuit blocks.

$$[X_{1_j}, X_{2_j}, X_{3_j}, Y_{1_j}, Y_{2_j}, Y_{3_j}] = f(m_{1_j}, m_{2_j}, m_{3_j}, p_{1_j}, p_{2_j}, p_{3_j}). \tag{7-7}$$

where

-   m_(l_j) is the present measurement of the circuit block j in the channel l;
-   p_(l_j) is the past measurement of the circuit block j in the channel l;
-   X_(l_j), Y_(l_j) is the state of the circuit block j in the channel l.

Based on the above definitions, a fault hypothesis for malfunctions of circuit blocks and subsystems can be formed as in Eq. (7-8), where the goal is to integrate the states of circuit blocks and subsystems.

H=[X,Y].  (7-8)

where X is the summary of X_(C) and X_(S), and Y is the summary of Y_(C) and Y_(S).

A detection function E(H) reflects the credibility of H, as defined in Eq. (7-9). A smaller E(H) suggests a higher credibility of H. If the detection function is equal to or greater than unity, a reconfigure command should be issued.

$\begin{matrix}{{E(H)} = {{\sum\limits_{j}^{n_{c}}\left( {{W_{xc_{j}}x_{c_{j}}} + {W_{{yc}_{j}}y_{c_{j}}}} \right)} + {\sum\limits_{k}^{n_{s}}{\left( {{W_{{xs}_{k}}x_{s_{k}}} + {W_{{ys}_{k}}y_{s_{k}}}} \right).}}}} & \left( {7\text{-}9} \right)\end{matrix}$

where w_(xc_j), w_(yc_j), w_(xs_k), and w_(ys_k) are the weights of the discrepancy index. The range of the weights is from 0.1 to 1. If w₁ ≫ w₂, it means that the discrepancy index weighted by w₁ is much more important than the one weighted by w₂. The values of these weights are determined according to the significance of circuit blocks and subsystems in electronic systems.

Referring now to FIG. 10 there is shown a flowchart of a fault detection loop 1000 in each channel according to an embodiment of the present matter. The states of the fault hypotheses (H₁, H₂, H₃) are updated in a timely manner for the calculation of the detection functions (E(H₁), E(H₂), E(H₃)) in each channel for all three channels. The results of fault detection are transmitted to the diagnosis loop for the calculation of the objective function; the decision-making unit then generates diagnosis results and reconfigure suggestions.

Referring now to FIG. 11 there is shown a flowchart of fault diagnosis 1100 according to an embodiment of the present matter. First, a new fault hypothesis is generated according to the system architecture. Subsequently, the objective function is updated based on the results of fault detection. If the objective function E(H) is equal to or greater than 1, U_(F_i)(u) and the diagnosis suggestions should be generated.

As may be seen from the above, methods and systems have been described to achieve radiation-tolerant design according to embodiments of the present matter. For example, a radiation-tolerant architecture was described, along with techniques for hardening the radiation-tolerant architecture against single event effects by using redundancy, diversity in different component technologies, and fault detection and diagnosis. Further, a decision-logic unit for generating decisions to reconfigure faulty or damaged channels was also described in detail above. Effects of TID were also mentioned above, along with approaches to mitigating TID which included techniques of shielding and component selection. These latter two techniques will now be discussed in greater detail below.

In accordance with another embodiment of the present matter, radiation shielding protection with different materials is used to protect against common-mode damage of the COTS-based electronic components in the radiation-tolerant system. However, for portability of the wireless system, the size and weight of the shielding protection are also limited.

Referring to FIG. 12 there is shown a multi-layer radiation shielding 1200 according to an embodiment of the present matter. The shielding 1200 is composed of three layers of shielding as illustrated in FIG. 12b, FIG. 12c and FIG. 12d respectively. The radiation shielding 1200 is configured to increase radiation tolerance of the system 100 while avoiding common-mode damage and minimizing accumulated dose. Recall that, as mentioned earlier, radiation particles can change the normal operating parameters of electronic components and alter their electrical characteristics, subsequently leading to functional failures. If the accumulated dose exceeds the tolerance limit, components can suffer permanent damage.

Referring to FIG. 12a1 and FIG. 12a2 there is shown an example physical circuit board configuration. In the example of the TMR core 100, each of the three active channels and their corresponding spares are constructed on individual, separate circuit boards. The circuit boards are arranged at different angles 1201 with respect to each other, as shown for example in FIG. 12a1, which reduces common-mode effects. It is appreciated that many other angles and relative configurations between the channels may be employed. Referring to FIG. 12b there is shown the first layer of shielding, which is composed of material that tightly encloses the circuit boards. FIG. 12c shows a lead block 1204 into which the enclosed circuit boards are embedded. Finally, referring to FIG. 12d there is shown a third layer 1206 which encapsulates the entire system. The materials used in each layer are determined by the type and the radiation degradation factors of the semiconductor devices on these circuit boards.

For a given radiation source, a given radiation dose rate, and a known shielding material, the required shielding thickness under a broad geometry can be calculated as follows:

d=ln(B·I₀/I)/μ.

where I₀ and I are the radiation intensities before and after the shield, respectively, and the linear attenuation coefficient (μ) is the probability per unit thickness that particles interact with the material. This value depends on the atomic number Z of the material and its density (ρ). The build-up factor (B) is defined as the ratio of the intensity of the radiation at any point in a beam to the intensity of the primary radiation only at that point. According to the equation, various shielding materials may be selected and their performance compared; the designed shielding thickness may also be evaluated against the design objective of reducing the total dose to a level less than the chosen threshold, e.g. 20 K Rad (Si).
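A short numerical sketch of the thickness equation follows. The attenuation coefficient is written `mu` in the code. All numeric values below are hypothetical placeholders chosen only to exercise the formula, and the build-up factor is treated as a constant, which is a simplification because in practice it varies with material, energy, and thickness.

```python
import math


def shield_thickness(i0: float, i_target: float, mu: float,
                     buildup: float = 1.0) -> float:
    """Thickness d = ln(B * I0 / I) / mu, in the length unit of 1/mu.

    i0       -- intensity (or dose rate) incident on the shield
    i_target -- intensity (or dose rate) allowed behind the shield
    mu       -- linear attenuation coefficient of the shielding material
    buildup  -- build-up factor B (B = 1 corresponds to narrow-beam geometry)
    """
    if i0 <= 0 or i_target <= 0 or mu <= 0 or buildup < 1.0:
        raise ValueError("intensities and mu must be positive, and B >= 1")
    # No shielding is needed when the target is already met.
    return max(0.0, math.log(buildup * i0 / i_target) / mu)


# Hypothetical case: reduce the dose rate tenfold with mu = 0.5 per cm.
print(shield_thickness(100.0, 10.0, mu=0.5))               # narrow beam
print(shield_thickness(100.0, 10.0, mu=0.5, buildup=2.0))  # broad beam, B = 2
```

The second call shows the effect of the broad-geometry correction: scattered radiation that reaches the protected point requires additional thickness of ln(B)/μ compared with the narrow-beam estimate.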

In an embodiment the 1st, 2nd and 3rd layers may be, for example, lead, iron and aluminum respectively. Other combinations may be, for example, respectively: tungsten, lead, copper; or tungsten, lead, lead glass. These are by no means exhaustive combinations. The type of material could, theoretically, encompass all materials that may be used for radiation shielding if thick enough. The choice of the shielding material is dependent on many factors: desired attenuated radiation levels, effectiveness of heat dissipation, resistance to radiation damage, required thickness and weight, multiple use considerations, uniformity of shielding capability, permanence of shielding and availability.

In accordance with a further embodiment of the present invention, a further method to mitigate effects of total ionizing dose includes selection of components by considering radiation degradation factors of diversified components in a pool of COTS components with similar functionalities, to achieve higher radiation resistance under the given radiation conditions.

Component Selection

Component selection is a consideration in the design phase of COTS-based rad-hardened systems. Radiation effects differ across devices, circuits and systems. How sensitive a device is to these effects depends on material composition, the structure of p-n junctions, manufacturing technology, and the domain of intended application. According to radiation damage thresholds and radiation tolerance known in the art, as shown in FIG. 13, as well as radiation test data in the literature, most semiconductor components will experience device degradation and radiation damage when the total dose is more than 20 K Rad (Si) (1 Gy = 100 Rad (Si)). Therefore, the total dose limit is set to a threshold value, which in the present example is 20 K Rad (Si). Different thresholds may be chosen depending on other test data. The radiation resistance of selected candidate components should exceed this threshold total dose limit.

As an example, by referring to the radiation test data shown in FIG. 13, the following principles may be used to illustrate a component selection process for the different active channels and their corresponding spare channels, respectively:

-   To implement redundant channels and their spares with diverse semiconductor technologies, e.g., one channel uses bipolar components, a second channel uses CMOS components, and a third channel uses hybrid components;
-   To select semiconductor components with higher radiation resistance by calculating the radiation degradation factor based on radiation test data; the selected component should work normally under a total dose of 20 K Rad (Si);
-   To improve the radiation resistance of each channel by assessing reliability under the given radiation conditions.

Because semiconductor components may have a number (n_(p)) of critical parameters, in an embodiment of the present matter the radiation degradation factor is defined as the mean value of the degradation factors of all critical parameters, which can be described as follows.

$\Delta = \frac{\sum\limits_{i = 1}^{n_{p}} \min\left\{ \left( P_{i_{o}} - P_{i_{t}} \right)/\left( P_{i_{o}} - P_{i_{f}} \right),\ 1 \right\}}{n_{p}}.$
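The averaging in the formula above can be illustrated with a short Python sketch. The reading of the subscripts is an assumption made here: for each critical parameter, the first value is taken as the pre-irradiation value, the second as the value measured after irradiation, and the third as the value at which the parameter is considered to have failed. The function name and the numbers in the usage line are hypothetical.

```python
def degradation_factor(params):
    """Mean of per-parameter degradation ratios, each capped at 1.

    params: list of (p_initial, p_irradiated, p_failure) tuples, one tuple
    per critical parameter. The meaning of the three values is an
    assumption of this sketch (see the accompanying text).
    """
    if not params:
        raise ValueError("at least one critical parameter is required")
    total = 0.0
    for p_initial, p_irradiated, p_failure in params:
        if p_initial == p_failure:
            raise ValueError("initial and failure values must differ")
        ratio = (p_initial - p_irradiated) / (p_initial - p_failure)
        total += min(ratio, 1.0)  # a parameter past its failure value counts as 1
    return total / len(params)


# Hypothetical device with two critical parameters: the first has moved
# 20% of the way to failure, the second has gone beyond its failure value.
delta = degradation_factor([(10.0, 9.0, 5.0), (100.0, 40.0, 50.0)])
```

With these placeholder inputs the two capped ratios are 0.2 and 1, so the factor is their mean.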

TABLE Summary of selected candidate components and radiation degradation factors

| Device Type | Device | Δ_(10K) | Δ_(20K) | Δ_(50K) | Δ_(100K) |
|---|---|---|---|---|---|
| BJT | 2N2222 | 0.1940 | 0.3201 | 0.4267 | 0.4591 |
| Voltage reference | LT1021 | 0.0774 | 0.1010 | 0.2104 | 0.3432 |
| | LT1009 | 0.0642 | 0.1099 | 0.5158 | 0.5786 |
| | MP5010 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| | AD580 | 0.1510 | 0.0181 | 0.0087 | 0.0094 |
| | REF-10 | 0.1408 | 0.3371 | 0.3204 | 0.3846 |
| | AD780 | 0.0039 | 0.0229 | 0.0246 | 0.0209 |
| | TL431 | 0.0055 | 0.0269 | 0.0238 | 0.0646 |
| | LM117HVK | 0.1639 | 0.2916 | 0.2933 | 0.2464 |
| | LP2951 | 0.1226 | 0.1737 | 0.3277 | 0.5699 |
| | UDS2983 | 0.3607 | 0.2557 | 0.2472 | 0.2541 |
| OP amplifier | CLC502 | 0.0208 | 0.0365 | 0.0383 | 0.0365 |
| | PA51M | 0.0409 | 0.0770 | 0.2989 | 0.2168 |
| | LM108 | 0.2377 | 0.3964 | 0.6620 | 0.6537 |
| | LM136 | 0.0098 | 0.0186 | 0.2431 | 0.2593 |
| | MC35181 | 0.0689 | 0.1551 | 0.3673 | 0.5151 |
| | LM317 | 0.2970 | 0.4120 | 0.5294 | 0.5568 |
| | PA07M | 0.1360 | 0.0764 | 0.1757 | 0.2717 |
| | OP43 | 0.1409 | 0.3128 | 0.4047 | 0.4182 |
| | AD544 | 0.1331 | 0.3963 | 0.4759 | 0.5132 |
| | AD713 | 0.3271 | 0.6739 | 0.8221 | 0.7451 |
| | MP3518 | 0.0689 | 0.1551 | 0.3673 | 0.5151 |
| | TL074 | 0.2402 | 0.3267 | 0.3742 | 0.3250 |
| Analog-to-digital converter | AD574 | 0.0178 | 0.0486 | 0.0633 | 0.0649 |
| | AD674 | 0.1735 | 0.1503 | 0.2741 | 0.3345 |
| | AD7885 | 0.0181 | 0.0229 | 0.0246 | 0.0209 |
| | AD713 | 0.2265 | 0.3899 | 0.4286 | 0.3926 |
| E²PROM | 28C010 | 0.0187 | 0.0465 | 0.1001 | 0.1179 |
| FPGA | A1280 | 0.0023 | 0.0244 | 0.1341 | 0.1326 |
| Microcontroller | 82C59 | 0.0638 | 0.0654 | 0.0985 | 0.1190 |
| Logic gate | 54AC02 | 0.0469 | 0.0494 | 0.0480 | 0.0724 |
| | 54AC08 | 0.0133 | 0.0244 | 0.1850 | 0.2432 |
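One possible way to use such degradation factors during selection is sketched below in Python. The dictionary holds the Δ_(20K) values of a few operational amplifiers copied from the table above. The cutoff of 0.1 on the degradation factor and the function name are assumptions made for this illustration only; the specification does not prescribe a numeric cutoff on Δ.

```python
# Δ_(20K) values for a few operational amplifiers, taken from the table above.
DELTA_20K = {
    "CLC502": 0.0365,
    "PA51M": 0.0770,
    "LM108": 0.3964,
    "LM136": 0.0186,
    "PA07M": 0.0764,
    "OP43": 0.3128,
}


def rank_candidates(delta_by_device, max_delta):
    """Keep devices whose degradation factor does not exceed max_delta
    and return them sorted from least to most degraded."""
    kept = [(device, delta) for device, delta in delta_by_device.items()
            if delta <= max_delta]
    return sorted(kept, key=lambda item: item[1])


# Hypothetical cutoff of 0.1, chosen only for this example.
shortlist = rank_candidates(DELTA_20K, max_delta=0.1)
```

A shortlist of this kind could then be split across the active and spare channels so that each channel draws on a different device or technology.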

Referring to FIG. 14 there is shown an example implementation of a typical function in a high radiation environment. In this example the function is a typical wireless measurement and transmission system 1400 implemented in accordance with an embodiment of the present matter. Though there may be many different components and circuits, the common building blocks are: a signal processing circuit, an analog-to-digital converter, a microcontroller, and a transceiver. These subsystems can nevertheless be built with different semiconductor technologies and with different components from different manufacturers. As described in the present matter, an understanding of the radiation responses of these devices under different radiation conditions provides information for the design of the wireless monitoring system. A number of the semiconductor components listed in the Table above and in FIG. 13 may be selected to implement functional blocks in the wireless measurement and transmission system 1400. As illustrated in FIG. 14, three non-redundant wireless measurement and transmission units, denoted Sample-1, Sample-2, and Sample-3, are built with diversified semiconductor technologies.

Table II below is a summary of the semiconductor components used in the irradiated devices. Sample-1, Sample-2, and Sample-3 are selected as A₁/S₁, A₂/S₂, and A₃/S₃, respectively.

| Channel | Device ID | Semiconductor technology | Manufacturer |
|---|---|---|---|
| Sample-1 | LT1611 | Bipolar | Linear Technology |
| | CLC502 | | National Semiconductor |
| | AD571 | | Analog Devices |
| | RF2905 | | RF Micro Devices |
| | P89V51RC2 | CMOS | NXP |
| Sample-2 | REF03 | Bipolar | Analog Devices |
| | MAX660 | CMOS | Texas Instruments |
| | AD674 | | Analog Devices |
| | PIC16F77 | | Microchip |
| | SX1278 | | SEMTECH |
| Sample-3 | LM2662 | BiCMOS | Texas Instruments |
| | UA741 | Bipolar | STMicroelectronics |
| | AD1671 | BiCMOS | Analog Devices |
| | C8051F581 | TTL Logic | Silicon Labs |
| | SI4463 | | Silicon Labs |

Conclusion

A methodology is described to achieve rad-hardened design of communication systems without relying on rad-hardened semiconductor devices. Embodiments of the system provide for the use of one or more of diversified component selection, redundancy in switchable communication channels, real-time fault-detection, and multi-layer shielding protection so that COTS components may achieve reliable operation in high radiation environments. A modern monitoring system for strong radiation environments, built with commercial off-the-shelf components, is described herein. Reference is also made to a paper entitled “A Radiation-Tolerant Wireless Monitoring System Using a Redundant Architecture and Diversified Commercial Off-the-Shelf Components,” Q. Huang et al., published in IEEE Transactions on Nuclear Science, Vol. 65, No. 9, September 2018, pages 2582-2592, the entirety of which is incorporated herein by reference.

1. A method for selecting components in a radiation tolerant electronic system, comprising: determining ionizing radiation responses of COTS devices under various radiation conditions; selecting a subset of the COTS devices whose radiation responses satisfy threshold radiation levels; applying mathematical models of the COTS devices for post-irradiation conditions to determine radiation responses to ionizing radiation; and implementing a hardware circuit using COTS devices from the selected subset, wherein the implemented circuit may be tested for robustness to ionizing radiation effects without repeated destructive tests of the hardware circuit by using the mathematical models for simulating response to the ionizing radiation.
2. A method for a radiation tolerant electronic system subject to cumulative and single event radiation effects, the method comprising: selecting a group of electronic components operable below a cumulative radiation exposure threshold; and implementing a circuit architecture employing said components, wherein the circuit architecture is configured to be tolerant to the single event effects of radiation.
3. The method of claim 2, wherein the configured architecture includes a plurality of redundancy channels for executing a circuit function, and wherein each said channel duplicates the circuit function with distinct and different diversity of components from said group of selected electronic components.
4. The method of claim 3, wherein the configured architecture includes a detecting and diagnosing mechanism configured in each channel wherein each channel may detect abnormal operation in any channel and provide reconfiguration information to activate or de-activate channels.
5. The method of claim 3, wherein the channels are arranged to form a triple redundancy core of active channels and corresponding spare channels.
6. The method of claim 2, including using multilayer shielding wherein each layer of shielding has different materials determined by the diversity of selected components.
7. The method of claim 3, wherein the configured architecture includes bus and power configurators for reconfiguring power and bus signals between channels in response to signals from the diagnosing and detecting mechanism.
8. The method of claim 7, wherein the configurators are implemented with passive COTS components selected from one or more of resistors, capacitors and non-electronic relays.
9. A radiation-tolerant design method for implementing circuit functions using COTS components, the method including: determining life spans of electronic components in ionizing radiation environments; and developing a circuit architecture with a redundant channel configuration selection, including one or more of: selecting diversified components with non-electronics-based switches for channel selection; adding on-line and real-time fault-detection and prognostic schemes to switch among different channels to maintain continued operation; and using diversified multi-layer shielding protections to reduce a total cumulative radiation level.
10. A method of operating a radiation tolerant system having redundant circuit channels, comprising: detecting failure in a channel by the failed channel's diagnosis unit or from diagnosis units in channels external to the failed channel; and providing a reconfiguration signal from decision making units in non-failed channels, based on the detection by the diagnosis units, to remove power to the failed channel and to apply power to a spare channel.
11. A radiation tolerant electronic system architecture comprising: a plurality of independent circuit channels each duplicating a designated circuit function, the channels being divided into groups of active channels and spare channels; and wherein each said channel duplicates the circuit function with distinct and different diversity of components from a group of electronic components selected based on one or more criteria related to at least radiation tolerance.
12. The radiation tolerant electronic system architecture of claim 11, wherein the components are COTS components.
13. The radiation tolerant electronic system architecture of claim 11, including a detecting and diagnosing mechanism configured in each of the plurality of channels wherein each channel is able to detect abnormal operation in one or more channels and provide reconfiguration information to activate or de-activate channels.
14. The radiation tolerant electronic system architecture of claim 13, including bus and power configurators for reconfiguring power and bus signals between channels in response to signals from the diagnosing and detecting mechanism.
15. The radiation tolerant electronic system architecture of claim 13, including diversified multi-layer shielding protections to reduce a total radiation level.
16. The radiation tolerant electronic system architecture of claim 15, including physically arranging components of respective channels at different angles in the multilayer shielding to minimize common-mode radiation damage.